Low-Complexity Heuristics for Deriving Fine-Grained Classes of Named Entities from Web Textual Data

نویسنده

  • Marius Pasca
چکیده

We introduce a low-complexity method for acquiring fine-grained classes of named entities from the Web. The method exploits the large amounts of textual data available on the Web, while avoiding the use of any expensive text processing techniques or tools. The quality of the extracted classes is encouraging with respect to both the precision of the sets of named entities acquired within various classes, and the labels assigned to the sets of named entities.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Fine-grained Arabic named entity recognition

Named Entity Recognition (NER) is a Natural Language Processing (NLP) task, which aims to extract useful information from unstructured textual data by detecting and classifying Named Entity (NE) phrases into predefined semantic classes. This thesis addresses the problem of fine-grained NER for Arabic, which poses unique linguistic challenges to NER; such as the absence of capitalisation and sho...

متن کامل

SANE: System for Fine Grained Named Entity Typing on Textual Data

Assignment of fine-grained types to named entities is gaining popularity as one of the major Information Extraction tasks due to its applications in several areas of Natural Language Processing. Existing systems use huge knowledge bases to improve the accuracy of the fine-grained types. We designed and developed SANE, a system that uses Wikipedia categories to fine grain the type of the named e...

متن کامل

Heuristic Methods for Reducing Errors of Geographic Named Entities Learned by Bootstrapping

One of issues in the bootstrapping for named entity recognition is how to control annotation errors introduced at every iteration. In this paper, we present several heuristics for reducing such errors using external resources such as WordNet, encyclopedia and Web documents. The bootstrapping is applied for identifying and classifying fine-grained geographic named entities, which are useful for ...

متن کامل

Entity Analysis with Weak Supervision: Typing, Linking, and Attribute Extraction

Entity Analysis with Weak Supervision: Typing, Linking, and Attribute Extraction Xiao Ling Chair of the Supervisory Committee: Professor Daniel S. Weld Computer Science and Engineering With the advent of the Web, textual information has grown at an explosive rate. To digest this enormous amount of data, an automatic solution, Information Extraction (IE), has become necessary. Information extrac...

متن کامل

Fine-Grained Classification of Named Entities by Fusing Multi-Features

Due to the increase in the number of classes and the decrease in the semantic differences between classes, fine-grained classification of Named Entities is a more difficult task than classic classification of NEs. Using only simple local context features for this fine-grained task cannot yield a good classification performance. This paper proposes a method exploiting Multi-features for fine-gra...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2008